Abstract

We use a replica approach to deal with portfolio optimization problems. A given risk measure
is minimized using empirical estimates of the correlations between asset values. We study the phase transition
which happens when the time series is too short with respect to the size of the portfolio. We also
study the noise sensitivity of portfolio allocation when this transition is approached. We consider
explicitly the cases where the absolute deviation and the conditional value-at-risk are chosen as the
risk measure. We show how the replica method can study a wide range of risk measures, and deal
with various types of time series correlations, including realistic ones with volatility clustering.
I. INTRODUCTION

The portfolio optimization problem dates back to the pioneering work of Markowitz 
and is one of the main issues of risk management. Given that the input data of any risk 
measure ultimately come from empirical observations of the market, the problem is directly 
related to the presence of noise in financial time series. In a more abstract (model-based) 
approach, one uses Monte Carlo simulations to get "in-sample" evaluations of the objective 
risk function. In both cases the issue is how to take advantage of the time series of the 
returns on the assets in order to properly estimate the risk associated with our portfolio. 
This eventually results in the choice of the risk measure, and a long debate in recent years
has drawn attention to two important and distinct issues: the mathematical property
of coherence [?], and the noise sensitivity of the optimal portfolio. The rationale behind the
first of these issues lies in the need for a formal (axiomatic) translation of the basic common
principles of risk management, like the fact that portfolio diversification should always lead
to risk reduction. Moreover, requiring a risk measure to be coherent implies the existence
of a unique optimal portfolio and a well-defined variational principle, of obvious relevance
in practical cases. The second issue is also a very delicate one. In a realistic experimental
set-up, the number N of assets included in a portfolio can be of order $10^2$ to $10^3$, while
the length of a trustworthy time series hardly goes beyond a few years, i.e. $T \sim 10^3$. A good
estimate of any extensive observable would require the condition $N/T \ll 1$ to hold, but this
is rarely the case. Instead, the ratio of assets to data points, N/T, will be considered as a
finite number.

In this note we address analytically the risk minimization problem by studying the dependence
of the optimal portfolio on the ratio N/T and on other potential external parameters.
We first assume that the real distribution of returns is multinormal in order to keep the
problem tractable from the analytical point of view. Generalizations to more realistic
return distributions are also presented. Our approach consists in writing down the empirical
estimate of the risk measure and then reformulating the problem from the point of view
of statistical physics. We work out the analytical solution by means of the replica
method [?] and thus get some insights into the optimal portfolios. The analytical solution
confirms previous results on the existence of a phase transition [?]. The ratio N/T plays
the role of a control parameter. When it increases, there exists a sharply defined threshold
value where the estimation error of the optimal portfolio diverges. A first account of our
method, limited to the Expected Shortfall risk measure, has appeared in ref. [5]. Here we give
a more general presentation, studying other risk measures and more realistic distributions
of returns.

The paper is organized as follows. In section II we introduce the notations we will use
throughout the paper and we formulate the problem in its general mathematical form. In
section III we consider the case of the absolute deviation (AD). We present the replica
calculation of the optimal portfolio and compute explicitly a noise sensitivity measure
introduced in ref. [10]. In section IV we deal with portfolio optimization under Expected
Shortfall [3, ?], which was shown to have a non-trivial phase diagram [?] and then studied
analytically [5]. The striking point is that, for some values of the external parameters of
the problem, the minimization problem is not well defined and thus cannot admit a finite
solution. We investigate here the same feature while considering realistic distributions of
returns, so as to take into account volatility clustering. The replica approach then turns into
a semi-analytic and extremely versatile technique. We discuss this point and then summarize
our results in section V.

II. THE GENERAL SETTING 

We denote our portfolio by $w = \{w_1, \dots, w_N\}$, where $w_i$ is the position on asset i. We do
not impose any restriction on short selling: $w_i$ is a real number. The global constraint induced
by the total budget reads $\sum_i w_i = N$, where, for later mathematical convenience, we
have chosen a slightly different normalization with respect to the previous literature. Calling
$x_i$ the return of asset i and assuming the existence of a well-defined probability density
function (pdf) $p(x_1, \dots, x_N)$, one is interested in computing the pdf of the loss $\ell$ associated
with a given portfolio, i.e.

$$ p_w(\ell) = \int \prod_{i=1}^{N} dx_i \, p(x_1, \dots, x_N) \, \delta\Big( \ell + \sum_{i=1}^{N} w_i x_i \Big) . \qquad (1) $$



The complete knowledge of this pdf would lead to the precise, though still probabilistic,
evaluation of the loss, thus allowing for a straightforward optimization over the space of
legal portfolios. This is actually a pretty difficult task and one usually restricts oneself to some
characteristic of this pdf (e.g. its first moments, its tail behavior), so as to capture the
consequences of extremely bad events in the global loss. The actual $p(x_1, \dots, x_N)$ is not known
in general, and integrals like the one in (1) are usually estimated from time series, coming
from market observations or synthetically produced by numerical simulations. Whatever the
chosen risk measure, then, one typically faces cost functions (to be optimized over all possible
portfolios) like



$$ \mathrm{risk}(w; N, T, \Lambda) = \sum_{\tau=1}^{T} \mathcal{F}_\Lambda\Big( \sum_{i=1}^{N} w_i x_i^{(\tau)} \Big) , \qquad (2) $$



where $\{x_i^{(\tau)}\}$ is the whole time series of the return i and where we denoted by
$\Lambda$ other possible external parameters of the risk measure. The best known example of a risk
measure is of course the variance, as first suggested by Markowitz. In that case the risk
function is obtained by taking $\mathcal{F}_\Lambda(z) = z^2$ in (2). The evaluation of the variance implies
an empirical evaluation of the covariance matrix $\sigma_{ij}$ of the underlying stochastic process,
and the extremely noisy character of any estimation of $\sigma_{ij}$ has been underlined a few years
ago. However, recent studies [10, 11] have shown that the effect of the noise on
the actual portfolio risk is not as dramatic as one might have expected. More in detail, a
direct measure of this effect was introduced and explicitly computed in the simplest case of
$\sigma_{ij} = \delta_{ij}$. In the next section, we compute the same quantity as far as the absolute deviation
of the loss is concerned.
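As an illustration of how a cost function of the form (2) is evaluated on data, here is a minimal Python sketch. The names `empirical_risk`, `variance_kernel` and `abs_dev_kernel` are hypothetical, the synthetic Gaussian returns are only placeholder data, and the plain sum over the T observations is an assumed normalization (dividing by T rescales the risk without changing the minimizing portfolio).

```python
import random


def empirical_risk(w, x, kernel):
    """Sum of kernel(portfolio return) over the T observations.

    w      : list of N portfolio weights
    x      : list of T rows, x[tau][i] = return of asset i at time tau
    kernel : the function F(z) that defines the risk measure
    """
    return sum(kernel(sum(wi * xi for wi, xi in zip(w, row))) for row in x)


def variance_kernel(z):
    # F(z) = z^2: variance-type risk (zero-mean returns assumed)
    return z * z


def abs_dev_kernel(z):
    # F(z) = |z|: absolute deviation
    return abs(z)


# Placeholder data: T observations of N independent Gaussian returns.
random.seed(0)
N, T = 5, 20
returns = [[random.gauss(0.0, 1.0) for _ in range(N)] for _ in range(T)]
weights = [1.0] * N  # satisfies the budget constraint sum_i w_i = N

print("variance-type risk:", empirical_risk(weights, returns, variance_kernel))
print("absolute deviation:", empirical_risk(weights, returns, abs_dev_kernel))
```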

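In the statistical physics approach, one studies the limit $N, T \to \infty$, while N/T = 1/t is
finite. One introduces the partition function at inverse temperature $\gamma$:

$$ Z_\gamma^{(N)}[t, \Lambda; \{x_i^{(\tau)}\}] = \int \prod_{i=1}^{N} dw_i \, e^{-\gamma\, \mathrm{risk}[w; N, T, \Lambda]} \, \delta\Big( \sum_{i=1}^{N} w_i - N \Big) , \qquad (3) $$

from which any observable will be computed. For instance, the optimal cost (i.e. the
minimum of the risk function in (2)) is computed from

$$ e(t, \Lambda) = \lim_{N\to\infty} \frac{1}{N} \min_w \mathrm{risk}[w; N, Nt, \Lambda] = - \lim_{N\to\infty} \lim_{\gamma\to\infty} \frac{1}{N\gamma} \log Z_\gamma^{(N)}[t, \Lambda; \{x_i^{(\tau)}\}] . \qquad (4) $$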

It turns out that this expression depends on the actual sample (the time series $\{x_i^{(\tau)}\}$) used
to estimate the risk measure. We are mainly interested in the average over all possible time
series of this quantity, which we assume to be narrowly distributed around its mean value.
Taking the average of eq. (4) means that we have to average the logarithm of the partition
function according to the pdf $p(\{x_i^{(\tau)}\})$. The so-called replica method allows one to simplify this
task as follows. We compute $E[Z^n]$ for integer n and assume we can analytically continue
this result to real n: then $E[\log Z] = \lim_{n\to 0} (E[Z^n] - 1)/n$. This is the strategy that we are
going to use in the next sections and that will allow us to compute the optimal portfolio.
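The replica identity can be checked numerically on a toy case where everything is known in closed form. The sketch below assumes, purely for illustration, that Z = exp(g) with g Gaussian of mean `mu` and standard deviation `s`, so that E[log Z] = mu and E[Z^n] = exp(n mu + n^2 s^2 / 2). This toy distribution and the function name `replica_estimate` are not taken from the paper.

```python
import math


def replica_estimate(mu, s, n):
    """(E[Z^n] - 1) / n for the toy model Z = exp(g), g ~ N(mu, s^2).

    Uses the exact moment E[Z^n] = exp(n*mu + 0.5*(n*s)**2); expm1 keeps
    the subtraction accurate when n is very small.
    """
    return math.expm1(n * mu + 0.5 * (n * s) ** 2) / n


mu, s = -0.3, 0.8  # in this toy model E[log Z] equals mu exactly
for n in (1.0, 0.1, 0.01, 0.001):
    # The estimate approaches mu as n decreases towards zero.
    print(n, replica_estimate(mu, s, n))
```

At n = 1 the expression reduces to E[Z] - 1, which differs from E[log Z]; only the continuation to small n recovers the average of the logarithm.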



III. REPLICA ANALYSIS: ABSOLUTE DEVIATION 

The absolute deviation measure AD[w; N, T] is obtained by choosing $\mathcal{F}_\Lambda(z) = |z|$ in (2).
No other external parameters $\Lambda$ are present here. We assume a factorized distribution

$$ p[\{x_i^{(\tau)}\}] \sim \prod_{i,\tau} \exp\Big( - \frac{N (x_i^{(\tau)})^2}{2 \sigma_\tau^2} \Big) , \qquad (5) $$

where the volatilities $\{\sigma_\tau\}$ are distributed according to a pdf which we do not specify for
the moment. Following the replica method, we introduce n identical replicas of our portfolio
and compute the average of $Z^n$:

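$$ E[Z_\gamma^n(t)] \sim \int \prod_{a,b=1}^{n} dQ^{ab} \, d\hat{Q}^{ab} \, \exp\Big\{ N \sum_{a,b} (Q^{ab} - 1) \hat{Q}^{ab} - \frac{N}{2} \mathrm{Tr} \log \hat{Q} - \frac{T}{2} \mathrm{Tr} \log Q + \sum_{\tau=1}^{T} \log A_\gamma(\{Q^{ab}\}; \sigma_\tau) \Big\} , \qquad (6) $$

$$ A_\gamma(\{Q^{ab}\}; \sigma_\tau) = \int \prod_{a=1}^{n} du^a \, \exp\Big\{ - \frac{1}{2} \sum_{a,b} (Q^{-1})^{ab} u^a u^b - \gamma \sigma_\tau \sum_{a} |u^a| \Big\} , $$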

where we have introduced the overlap matrix

$$ Q^{ab} = \frac{1}{N} \sum_{i=1}^{N} w_i^a w_i^b , \qquad a, b = 1, \dots, n , \qquad (7) $$

as well as its conjugate $\hat{Q}^{ab}$, the Lagrange multipliers introduced to enforce (7). In the limit
$N, T \to \infty$, with N/T = 1/t finite, the integral in (6) can be solved by a saddle point method.
Due to the symmetry of the integrand under permutation of replica indices, there exists a
replica-symmetric saddle point [?]: $Q^{aa} = q_1$, $Q^{ab} = q_0$ for $a \neq b$, and the same for $\hat{Q}^{ab}$. We
expect the saddle point to be correct in view of the fact that the problem is linear. Under
this hypothesis, which will only be justified a posteriori by a direct comparison to numerical
data, the replicated partition function in (6) simplifies to

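$$ E[Z_\gamma^n(t)] \sim \int dq_0 \int d\Delta q \, \exp\big[ N n \, S_\gamma(q_0, \Delta q) \, (1 + O(n)) \big] , \qquad (8) $$

$$ S_\gamma(q_0, \Delta q) = \frac{(1-t) q_0 - 1}{2 \Delta q} + \frac{1-t}{2} \log \Delta q + \frac{t}{T} \sum_{\tau=1}^{T} \frac{1}{n} \log A_\gamma(q_0, \Delta q; \sigma_\tau) , $$

$$ A_\gamma(q_0, \Delta q; \sigma_\tau) = \int \frac{ds}{\sqrt{2\pi q_0}} \, e^{-s^2/2q_0} \Big\{ 1 + n \log \int du \, e^{-\frac{u^2}{2\Delta q} + \frac{s u}{\Delta q} - \gamma \sigma_\tau |u|} \Big\} + O(n^2) , $$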



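where $\Delta q = q_1 - q_0$ and n is the number of replicas (which will eventually go to zero). We
now assume that in the low temperature limit the overlap fluctuations are of order $1/\gamma$ and
introduce $\Delta = \gamma \Delta q$. One can show that if $\Delta$ stays finite at low temperatures

$$ \lim_{n\to 0} \lim_{\gamma\to\infty} \frac{1}{\gamma n} \log A_\gamma(q_0, \Delta/\gamma; \sigma_\tau) = \frac{\Delta^2 \sigma_\tau^3}{\sqrt{2\pi q_0}} \int_1^\infty ds \, e^{-\Delta^2 \sigma_\tau^2 s^2 / 2 q_0} \, (1 - s)^2 . \qquad (9) $$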

For the sake of clarity, we focus on the simple case $\sigma_\tau = 1 \ \forall \tau$. In the $\gamma \to \infty$ limit, the
saddle point equations for (8) are

$$ \frac{1}{t} = \mathrm{erf}\big( 1/\sqrt{2 \tilde{q}_0} \big) , \qquad (10) $$

$$ \Delta = \Big[ 2t \Big( \frac{t-1}{2t} \tilde{q}_0 + \sqrt{\frac{\tilde{q}_0}{2\pi}} \, e^{-1/2\tilde{q}_0} - \frac{1 + \tilde{q}_0}{2} \big( 1 - \mathrm{erf}( 1/\sqrt{2 \tilde{q}_0} ) \big) \Big) \Big]^{-1/2} , \qquad (11) $$

where $\tilde{q}_0 = q_0/\Delta^2$. The minimum cost function, i.e. the average of eq. (4), is found to be
$e(t) = 1/\Delta$. Notice that (10) only admits a solution for t > 1. There is no solution to the
minimization problem if the ratio of assets to data points, N/T = 1/t, is larger than 1. On the
other hand, equation (11) gives a finite $\Delta$ at any t > 1.
The asymptotic behaviour of e(t) can be worked out analytically: we introduce $\delta = 1 - 1/t$
and consider the limit $\delta \ll 1$. This leads to



The full solution and a comparison with numerics are shown in Fig. 1 (left).
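A minimal numerical sketch of the solution for $\sigma_\tau = 1$ is given below. It assumes the saddle point equations in the form $1/t = \mathrm{erf}(1/\sqrt{2\tilde{q}_0})$ and $\Delta = [2t(\ldots)]^{-1/2}$, with the bracket spelled out in the code comments, together with $e(t) = 1/\Delta$. The function names, the bisection bounds and the iteration count are illustrative choices, not part of the original calculation.

```python
import math


def solve_q0_tilde(t, lo=1e-6, hi=1e12, iters=200):
    """Solve erf(1/sqrt(2*q)) = 1/t for q by bisection on a log scale.

    The left-hand side decreases from 1 to 0 as q grows, so a root
    exists only when 1/t < 1, i.e. t > 1.
    """
    if t <= 1.0:
        raise ValueError("no finite solution for t <= 1")
    target = 1.0 / t
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if math.erf(1.0 / math.sqrt(2.0 * mid)) > target:
            lo = mid  # erf still too large: q must increase
        else:
            hi = mid
    return math.sqrt(lo * hi)


def optimal_cost(t):
    """Return (e, q0_tilde, Delta) with e = 1/Delta."""
    q = solve_q0_tilde(t)
    # Bracket assumed for Delta:
    # (t-1)/(2t)*q + sqrt(q/(2*pi))*exp(-1/(2q)) - (1+q)/2*(1 - erf(1/sqrt(2q)))
    bracket = (
        (t - 1.0) / (2.0 * t) * q
        + math.sqrt(q / (2.0 * math.pi)) * math.exp(-1.0 / (2.0 * q))
        - 0.5 * (1.0 + q) * (1.0 - math.erf(1.0 / math.sqrt(2.0 * q)))
    )
    delta = (2.0 * t * bracket) ** -0.5
    return 1.0 / delta, q, delta


for t in (1.5, 2.0, 5.0, 10.0):
    e, q, delta = optimal_cost(t)
    print(f"t = {t:5.1f}   q0_tilde = {q:10.4f}   Delta = {delta:8.4f}   e = {e:8.4f}")
```

Very close to t = 1 the bracket is a difference of nearly equal terms, so a higher-precision evaluation would be advisable there.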

We now address the issue of noise sensitivity, for which a measure was introduced in [10].
The idea is the following: assume you know the true pdf of the loss (1) and you get some
optimal $w^{(0)}$ by minimizing the absolute deviation of $\ell$. We want to compare the optimal
risk associated with $w^{(0)}$ with the one obtained by optimizing (2), i.e. the empirical estimation
of the same risk measure. A fair comparison is then $q_K - 1$, with

q 2 (N,T) = L r ' , J , 13 

where the $w_i^*$ refer to the portfolio obtained by minimizing (2). This is the quantity which
we have computed by the replica approach. In our calculation we have assumed a
factorized Gaussian distribution of returns (extensions to more realistic cases will
be presented in the next section) and it is straightforward to prove that in this case
$q_K = \sqrt{\sum_i (w_i^*)^2 / N}$. This corresponds in our language to $q_K = \sqrt{\tilde{q}_0}\,\Delta$, which diverges like
$(1 - 1/t)^{-1/2}$ as $1/t \to 1^-$. Corrections to this leading behavior (which is instead the full shape
of $q_K$ in the variance minimization problem) are needed in order to reproduce the data
(right panel of Fig. 1). The comparison with the Markowitz optimal portfolio (variance
minimization) indicates that the AD measure is actually less stable to perturbations: a
geometric interpretation of this result can be found in ref. [?]. Besides this fact, the interesting
result is then the existence of a well defined threshold value t = 1 at which the estimation
error becomes infinite. This is due to the divergence of the variance of the optimal portfolio
in the regime t < 1, where any minimization attempt is thus totally meaningless.

FIG. 1: Left: The analytic solution e(t) is compared with the results of numerical simulations,
where the constrained optimization is computed directly via linear programming methods [?].
Right: Numerical results for $\sqrt{\sum_i (w_i^*)^2 / N}$ compared to the analytic behaviour $\sqrt{\tilde{q}_0}\,\Delta$. The curve
denoted by $q_K$ (var) represents the behaviour of $q_K$ in the variance minimization problem.



IV. EXPECTED SHORTFALL 
A. The minimization problem 

For a fixed value of $\beta < 1$ ($\beta \gtrsim 0.9$ in the interesting cases) the expected shortfall (ES)
of a portfolio w is obtained by choosing $\mathcal{F}(z) \propto z\,\theta(z - \mathrm{VaR})$ in (2), where VaR stands here
for the Value-at-Risk [?]. In practice, it is computed from the minimization of a properly
chosen objective function [?]:

$$ \mathrm{ES}[w; N, T, \beta] = \min_v \Big\{ v + \frac{1}{(1-\beta) T} \sum_{\tau=1}^{T} \Big[ -v - \sum_{i=1}^{N} w_i x_i^{(\tau)} \Big]^+ \Big\} , \qquad (14) $$

where $[a]^+ = (a + |a|)/2$. Optimizing the ES risk measure over all the possible portfolios
satisfying the budget constraint is equivalent to the following linear programming problem: 

• Cost function: $E = (1-\beta) T v + \sum_{t=1}^{T} u_t$;

• Variables: $Y = \{w_1, \dots, w_N, u_1, \dots, u_T, v\}$;

• Constraints: $u_t \geq 0$, $u_t + v + \sum_{i=1}^{N} x_{it} w_i \geq 0$, $\sum_i w_i = N$.
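To make the linear program concrete, the following sketch assembles its cost vector and constraints from a return matrix, in the inequality form (A_ub y <= b_ub, A_eq y = b_eq) that generic LP solvers commonly expect. The function name `build_es_lp` and the data layout `x[t][i]` are assumptions made for this example, and no solver is called.

```python
def build_es_lp(x, beta):
    """Assemble the ES linear program for variables Y = (w_1..w_N, u_1..u_T, v).

    x[t][i] is the return of asset i at time t; beta is the confidence level.
    Returns (c, A_ub, b_ub, A_eq, b_eq, bounds).
    """
    T, N = len(x), len(x[0])

    # Cost: E = (1 - beta) * T * v + sum_t u_t  (the weights w carry no cost).
    c = [0.0] * N + [1.0] * T + [(1.0 - beta) * T]

    # u_t + v + sum_i x_ti w_i >= 0, rewritten as
    # -sum_i x_ti w_i - u_t - v <= 0 to match the "<=" convention.
    A_ub, b_ub = [], []
    for t in range(T):
        row = [-x[t][i] for i in range(N)] + [0.0] * T + [-1.0]
        row[N + t] = -1.0
        A_ub.append(row)
        b_ub.append(0.0)

    # Budget constraint: sum_i w_i = N.
    A_eq = [[1.0] * N + [0.0] * (T + 1)]
    b_eq = [float(N)]

    # w_i and v are free (short selling allowed); u_t >= 0.
    bounds = [(None, None)] * N + [(0.0, None)] * T + [(None, None)]
    return c, A_ub, b_ub, A_eq, b_eq, bounds


x = [[0.1, -0.2], [0.3, 0.0], [-0.1, 0.2]]  # toy data: T = 3, N = 2
c, A_ub, b_ub, A_eq, b_eq, bounds = build_es_lp(x, beta=0.9)
print(len(c), "variables,", len(A_ub), "inequality constraints")
```

If SciPy is available, the returned tuple can be handed to `scipy.optimize.linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)`. In the infeasible regime discussed in this section the program is expected to be unbounded, which a solver reports through its status flag.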

In a previous work we solved the problem in the case where the historical series of returns
is drawn from the oversimplified probability distribution (5), with $\sigma_\tau = 1 \ \forall \tau$. Here we take a
first step towards dealing with more realistic data and assume that the series of returns can
be obtained from a sequence of normal distributions whose variances depend on time:

$$ p[\{\sigma_\tau\}] \sim \prod_{\tau,\tau'} \exp\big( - \sigma_\tau \sigma_{\tau'} G^{-1}_{\tau\tau'} \big) \prod_{\tau} q(\sigma_\tau) , \qquad (15) $$

for some long range correlator $G_{\tau\tau'}$ which takes into account volatility correlations, and
$q(\sigma_\tau)$ equal e.g. to a lognormal distribution.



B. The replica solution 



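A straightforward generalization of the replica calculation presented in ref. [5] (and
sketched in the previous section for a similar problem) allows one to compute the average optimal
cost for a given volatility sequence $\{\sigma_1, \dots, \sigma_T\}$, in the limit when $N, T \to \infty$ and N/T = 1/t
stays finite. This is given by

$$ e(t, \beta \,|\, \{\sigma_\tau\}) = \min_{v, q_0, \Delta} \Big[ \frac{1}{2\Delta} + \Delta \, \varepsilon(t, \beta; v, q_0 | \{\sigma_\tau\}) \Big] , \qquad (16) $$

$$ \varepsilon(t, \beta; v, q_0 | \{\sigma_\tau\}) = t (1-\beta) v - \frac{q_0}{2} + \frac{1}{2\sqrt{\pi}\, N} \sum_{\tau=1}^{T} \int_{-\infty}^{+\infty} ds \, e^{-s^2} g\big( v/\sigma_\tau + s \sqrt{2 q_0}; \sigma_\tau \big) , \qquad (17) $$

where $\Delta = \lim_{\gamma\to\infty} \gamma \Delta q$ and the function $g(x; \sigma)$ is equal to $x^2$ if $-\sigma \leq x < 0$, to $-2\sigma x - \sigma^2$
if $x < -\sigma$, and to 0 otherwise. The minimization over v, $q_0$ implies that

$$ \partial \varepsilon / \partial v = \partial \varepsilon / \partial q_0 = 0 . \qquad (18) $$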



As discussed in [5], the problem admits a finite solution if (16) is minimized by a finite value
of $\Delta$. The feasible region is then defined by the condition $\varepsilon(t, \beta; v, q_0 | \{\sigma_\tau\}) > 0$, where v
and $q_0$ satisfy (18). This theoretical setup suggests the following semi-analytic protocol for
determining the phase diagram of realistic portfolio optimization problems.

1. Fix a value of $\beta \in [0, 1]$, and take N equal to the portfolio size you are interested in.

2. For $T = T_{\min}$ to $T_{\max}$, such that $N/T \in [0.1, 0.9]$, do the following:

(a) Generate a sequence $\{\sigma_1, \sigma_2, \dots, \sigma_T\}$ according to (15) and compute the $\varepsilon$ function
in (17).

(b) Minimize $\varepsilon$ with respect to v and $q_0$ according to (18).

(c) Repeat steps (a) and (b) for n samples, and compute the mean value $\langle \varepsilon \rangle$.

3. Plot $\langle \varepsilon \rangle$ vs. N/T and find the value $(N/T)^*$ where this function changes its sign.

By repeating this procedure for several values of $\beta$ we get the phase separation line $(N/T)^*$
vs. $\beta$.
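Step 2(a) of the protocol requires evaluating the function in (17) for a given volatility sequence. A possible sketch follows, assuming the piecewise form of g(x; sigma) given above and the prefactor t/(2 sqrt(pi) T) = 1/(2 sqrt(pi) N) for the sum over tau. The trapezoidal quadrature, its cutoff `s_max`, the grid size and the function names are illustrative choices, and the minimization of step 2(b) is left to any standard optimizer.

```python
import math


def g(x, sigma):
    """Piecewise kernel: 0 for x >= 0, x^2 for -sigma <= x < 0,
    and -2*sigma*x - sigma^2 for x < -sigma (continuous at x = -sigma)."""
    if x >= 0.0:
        return 0.0
    if x >= -sigma:
        return x * x
    return -2.0 * sigma * x - sigma * sigma


def epsilon(t, beta, v, q0, sigmas, s_max=6.0, n_grid=1201):
    """Evaluate the function of eq. (17) for one volatility sequence.

    The integral over s is approximated by the trapezoidal rule on
    [-s_max, s_max]; the weight exp(-s^2) is negligible beyond that range.
    """
    T = len(sigmas)
    h = 2.0 * s_max / (n_grid - 1)
    root = math.sqrt(2.0 * q0)
    total = 0.0
    for sigma in sigmas:
        acc = 0.0
        for k in range(n_grid):
            s = -s_max + k * h
            weight = 0.5 if k in (0, n_grid - 1) else 1.0
            acc += weight * math.exp(-s * s) * g(v / sigma + s * root, sigma)
        total += acc * h
    return t * (1.0 - beta) * v - 0.5 * q0 + t * total / (2.0 * math.sqrt(math.pi) * T)


# Example with constant volatility (sigma_tau = 1 for all tau).
print(epsilon(t=2.0, beta=0.9, v=0.5, q0=1.0, sigmas=[1.0] * 8))
```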



C. Results 



A simple way of generating realistic volatility series consists in looking at the return time
series as a cascade process [15]. In a multifractal model recently introduced [?], the volatility
covariance decreases logarithmically: this is achieved by letting $\sigma_\tau = \exp \xi_\tau$, where the $\xi_\tau$ are
Gaussian variables and

$$ \langle \xi_\tau \rangle = -\lambda^2 \log T_{\rm cut} , \qquad \langle \xi_\tau \xi_{\tau'} \rangle - \langle \xi_\tau \rangle^2 = \lambda^2 \log \frac{T_{\rm cut}}{1 + |\tau - \tau'|} , \qquad (19) $$

$\lambda$ quantifying volatility fluctuations (the so-called 'vol of the vol'), and $T_{\rm cut}$ being a large
cutoff. A few samples generated according to this procedure are shown in Fig. 2.
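One way to draw such a volatility sequence is to build the T x T covariance matrix of (19) and correlate independent Gaussian numbers through its Cholesky factor. The sketch below does this with the Python standard library only. The names `cholesky` and `sample_volatilities` are hypothetical, the method is an assumed choice rather than the procedure used for the figures, and it is practical for short sequences only since the cost grows like T^3. It also assumes T_cut > T so that all logarithms stay positive.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def sample_volatilities(T, lam2, T_cut, rng):
    """Draw sigma_tau = exp(xi_tau) with Gaussian xi_tau of mean
    -lam2 * log(T_cut) and covariance lam2 * log(T_cut / (1 + |tau - tau'|))."""
    cov = [
        [lam2 * math.log(T_cut / (1.0 + abs(a - b))) for b in range(T)]
        for a in range(T)
    ]
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(T)]
    mean = -lam2 * math.log(T_cut)
    xi = [mean + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(T)]
    return [math.exp(value) for value in xi]


rng = random.Random(1)
sigmas = sample_volatilities(T=32, lam2=0.2, T_cut=1.0e4, rng=rng)
print(min(sigmas), max(sigmas))
```

With this choice of mean, the average of $\sigma_\tau^2$ over the Gaussian variables equals 1, so the overall volatility scale is not affected by $\lambda^2$.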

The phase diagram obtained for different values of $\lambda^2$ is shown in Fig. 3. A comparison
with the phase diagram computed in the absence of volatility fluctuations shows that, while
the precise shape of the separating curve depends on the fine details of the volatility pdf,
the main message has not changed: there exists a regime, $N/T > (N/T)^*$, where the small
number of data with respect to the portfolio size makes the optimization problem ill-defined.
FIG. 2: The first three panels show 3 realizations of volatility sequences of length T = 1024
according to the model (19). Different panels correspond to different values of $\lambda^2$. The last panel
is a logarithmic representation of the $\lambda^2 = 0.40$ data.

In the "max-loss" limit $\beta \to 1$, where the single worst loss contributes to the risk measure,
the threshold value $(N/T)^* = 0.5$ does not seem to depend on the volatility fluctuations.
As $\beta$ gets smaller than 1, though, the presence of these fluctuations is such that the feasible
region becomes smaller than in the ideal multinormal case.



V. CONCLUSIONS 



In this paper we have discussed the replica approach to portfolio optimization. The rather
general formulation of the problem allows one to deal with several risk measures. We have shown
here the examples of absolute deviation, expected shortfall and max-loss (which is simply
taken as the limit case of ES). In all cases we find that the optimization problem, when
the risk measure is estimated by using time series, does not admit a feasible solution if the
ratio of assets to data points is larger than a threshold value. As discussed in ref. [?], this is
a common feature of various risk measures: the estimation error on the optimal portfolio,
originating from in-sample evaluations, diverges as a critical value is approached. In the
expected shortfall case, we have also discussed a semi-analytic approach which is suitable
for describing realistic time series. Our results suggest that, when volatility clustering is
taken into account, the phase transition is still there, the only effect being the reduction of
the feasible region. As a general remark, we have shown that the replica method may prove
extremely useful in dealing with optimization problems in risk management.

FIG. 3: The phase diagram corresponding to different values of the parameter $\lambda^2$. The full line
corresponds to the absence of fluctuations in the volatility distributions (i.e. $\sigma_\tau = 1 \ \forall \tau$).